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This tutorial gives a hands-on introduction to a numerical method for solving 
1 e— | ' integral equations stemming from elliptic partial differential equations on 

Q-j piecewise smooth domains. We will draw heavily from the existing literature, 

especially Helsing and Ojala [191 ED] and Helsing [13l [El HE] • Demo codes, 
in Matlab, are a part of the exposition and can be downloaded from: 
http : //www . maths . 1th. se/na/ staf f /helsing/Tutor/ 



1 Introduction 

The numerical method we describe has previously been referred to as recur- 
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sive compressed inverse preconditioning [19] and there is a good reason for 
this name: the method relies on applying a relieving right inverse to the in- 
tegral equation; on compressing this inverse to a low-dimensional subspace; 
and on carrying out the compression in a recursive manner. Still, the name 
recursively) compressed inverse preconditioning is a bit awkward and we 
will here refer to the method simply as the acronym RCIP. 

As is often the case when explaining algorithms of scientific comput- 
ing, the full story requires intermediate constructions that are not used in 
the final implementations. We shall downplay the role of such construc- 
tions and concentrate on working codes. Section [2] provides some historical 
background. The RCIP method is then explained in Sections I3HT51 Sec- 
tions [16HIH1 contain new applications to scattering problems. 

2 Background 

The line of research on fast solvers for elliptic boundary value problems on 
piecewise smooth domains, leading up to the RCIP method, grew out of 
work in computational fracture mechanics. Early efforts concerned finding 
efficient integral equation formulations. Corner singularities were either re- 
solved with brute force or by using special basis functions, see [121 EU 123] 
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for examples. Such strategies, in combination with fast multipole flOj accel- 
erated iterative solvers, work well for simple small-scale problems. 

Real world physics is more complicated and, for example, the study [9j 
on a high-order time-stepping scheme for crack propagation (a series of bi- 
harmonic problems for an evolving piecewise smooth surface) shows that 
radically better methods are needed. Special basis functions are too com- 
plicated to construct and brute force is not economical - merely storing the 
discretized solution becomes too costly in a large-scale simulation. 

A breakthrough came in 2007, when a scheme was created that resolves 
virtually any problem for Laplace's equation on piecewise smooth two di- 
mensional domains in a way that is fully automatic, fast, stable, memory 
efficient, and whose computational cost scales linearly with the number of 
corners in the computational domain. The resulting paper [19] constitutes 
the origin of the RCIP method. Unfortunately, however, there are some 
flaws in [19]. The expressions in [19\ Section 9] are not generally valid, the 
notation is clumsy, and the paper fails to apply RCIP in its entirety to the 
biharmonic problem of [19\ Section 3], which was the ultimate goal. 

The second paper on RCIP deals with elastic grains [2D]. The notation 
is still not ideal and singular integral operators are discretized in needlessly 
complicated ways. The part (2DJ Appendix B], on speedup, is useful. 

The third paper on RCIP contains improvement both in the notation 
and in the discretization of singular operators [13]. The overall theme is 
mixed boundary conditions, which pose similar difficulties as do piecewise 
smooth boundaries. 

The paper [13], finally, solves the problem of [121 Section 3] in a broad 
setting. One can therefore view the formalism of [14J as the RCIP method 
in its most general form, at least in two dimensions. Further work on RCIP 
has been devoted to more general boundary conditions [28J, to problems 
in three dimension [2T], and to solving elliptic boundary value problems in 
special situations, such as for aggregates of millions of grains, where special 
techniques are introduced to deal with problem specific issues [131 [TP]. 

We end this retrospection by pointing out that several research groups in 
recent years have proposed numerical schemes for integral equations stem- 
ming from elliptic partial differential equations on piecewise smooth do- 
mains. See, for example, [U [2J [3l [3J [51 [6] . There is also a widespread notion 
that a slight rounding of corners is a good idea for numerics. While rounding 
may work in particular situations, we do not believe it is a generally viable 
method. For one thing, how does one round a triple junction? 

3 A model problem 

Let r be the closed contour of Figure [1] with the parameterization 

r(s) = sin(vrs) (cos((s - 0.5)6), sin((s - 0.5)6*)) , s 6 [0, 1] . (1) 
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Figure 1: The contour T of (UJ) with a corner at the origin. The solid curve 
corresponds to opening angle 6 = tt/3. The dashed curve has 6 = 4ir/3. 

Let G(r, r') be the fundamental solution to Laplace's equation which in the 
plane can be written 

G(r, r') = log \r — r'\ . (2) 

We shall solve the integral equation 

p(r) + 2A / jr-Gir, r')p(r') da r , = 2A (e • u r ) , r G T , (3) 

for the unknown layer density p(r). Here v is the exterior unit normal of 
the contour, da is an element of arc length, A is a parameter, and e is a unit 
vector. 

Using complex notation, where vectors r, r', and v in the real plane M 2 
correspond to points z, r, and n in the complex plane C, one can express (|3|) 
as 

p(z) + ^Jp^i^^\=2XR{en M }, zeT, (4) 

where the 'bar' symbol denotes complex conjugation. Equation Q is a 
simplification over ([3]) from a programming point of view. 

From an algorithmic point of view it is advantageous to write ([3]) in the 
abbreviated form 



(I + XK) p(z) = Xg(z) , zer, 



(5) 




Figure 2: Left: A coarse mesh with ten panels on the contour T of ([!]) with 
opening angle 9 = ir/2. A subset of T, called T*, covers four coarse panels 
as indicated by the dashed curve. Right: A fine mesh created from the coarse 
mesh by subdividing the panels closest to the corner n su b = 3 times. 

where I is the identity. If T is smooth, then ([5]) is a Fredholm second 
kind integral equation with a compact, non-self-adjoint, integral operator K 
whose spectrum is discrete, bounded by one in modulus, and accumulates 
at zero. 

We also need a way to monitor the convergence of solutions p{z) to ([5]). 
For this purpose we introduce the quantity 

q = J p(z)${ez}d\z\ . (6) 



4 Discretization on two meshes 

We discretize © using the original Nystrom method based on composite 16- 
point Gauss-Legendre quadrature on two different meshes: a coarse mesh 
with n pan quadrature panels and a fine mesh which is constructed from the 
coarse mesh by n su b times subdividing the panels closest to the corner in a 
direction toward the corner. The discretization is in parameter. The four 
panels on the coarse mesh that are closest to the corner should be equisized 
in parameter. These panels form a subset of V called T*. See Figure [2j 

The linear systems resulting from discretization on the coarse mesh and 
on the fine mesh can be written formally as 

(I coa + AK coa ) p coa = Ag coa , (7) 

(Ifin + AK fin ) p fin = Ag fin , (8) 

where I and K are square matrices and p and g are column vectors. The 
subscripts fin and coa indicate what type of mesh is used. Discretization 
points on a mesh are said to constitute a grid. The coarse grid has n p = 
16n pan points. The fine grid has n p = 16(n pan + 2n su b) points. 
The discretization of ([5]) is carried out by first rewriting (Tjrj) as 

p{z(s)) + - ^ p(r(t))ml _ )= 2A3f? {^)} . «e [0,1]. (9) 
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Figure 3: Convergence for q of © using © and the program demo lb. m (a 
loop of demol.m) with lambda=0 . 999, theta=pi/2, npan=10, and evec=l. 
The reference value is q = 1.1300163213105365. There are n p = 160 + 32n su b 
unknowns in the main linear system. Left: Convergence with n su b. Right: The 
number of iterations needed to meet an estimated relative residual of e mac h. 

Then Nystrom discretization with n p points Zj and weights Wi on T gives 

\ ^ f Tl'l I IV ' ^ 

Pi + ~Y,P^{ — 3 \=^{e ni } , i = 1,2,. ..,n p . (10) 

The program demol.m sets up the system ([8]), solves it using the GM- 
RES iterative solver |29| incorporating a low-threshold stagnation avoiding 
technique [18j Section 8] , and computes q of ([6]) . The user has to specify the 
opening angle 9, the parameter A, the number n pan of coarse panels on T, 
the unit vector e and the number of subdivisions n su b- The opening angle 
should be in the interval vr/3 < 6 < 5vr/3. We choose A = 0.999, 6 = vr/2, 
"-pan = 10, and e = 1. The quantity q converges initially as n su b is increased, 
but for n su b > 44 the results start to get worse. See Figure [3j This is related 
to the fact, pointed out by Bremer [2], that the original Nystrom method 
captures the L°° behavior of the solution p, while our p is unbounded. See, 
further, Appendix D. 



5 Compressed inverse preconditioning 

Let us split the matrices K coa and Kg n of ([7]) and ([SJ) into two parts each 

K coa = K* oa + K c ° oa (11) 
K fin = K£ n + K£ n . (12) 
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Here the superscript * indicates that only entries of a matrix whose 
indices i and j correspond to points Zi and Zj that both belong to the 
boundary segment T* are retained. The remaining entries are zero. 

Now we introduce two diagonal matrices W coa and Wg n which have the 
quadrature weights tt>, on the diagonal. Furthermore, we need a prolongation 
matrix P which interpolates functions known at points on the coarse grid to 
points on the fine grid. The construction of P relies on panelwise 15-degree 
polynomial interpolation in parameter using Vandermonde matrices. We 
also construct a weighted prolongation matrix P^y via 

P W = WfinPW^ . (13) 

The matrices P and Pw share the same sparsity pattern. They are rect- 
angular matrices, similar to the the identity matrix, but with one full (4 + 
2n su b)16 x 64 block. Let superscript T denote the transpose. Then 

P^P = I coa (14) 

holds exactly. See Appendix A and [TH Section 4.3]. 

Equipped with P and Pw we are ready to compress (JSj) on the fine grid to 
an equation essentially on the coarse grid. This compression is done without 
the loss of accuracy (the discretization error in the solution is unaffected) 
and relies on the variable substitution 

(I fin + AKg n ) p fin = Pp coa . (15) 

Here p coa is the discretization of a piecewise smooth transformed density. 
The compression also uses the low-rank decomposition 

Kfi n = PK° oa P;, (16) 

which should hold to about machine precision. 
The compressed version of ([8|) reads 

(Icoa + AK° oa R) p coa = Ag coa , (17) 

where the compressed weighted inverse R is given by 

R = P^(I fin + AK^ n )- 1 P. (18) 

See Appendix B for details on the derivation. The compressed weighted 
inverse R, for T of ([T]) , is a block diagonal matrix with one full 64 x 64 block 
and the remaining entries coinciding with those of the identity matrix. 

It is important to observe that p fin is not always needed in computations. 
For example, the quantity q of ([6]) can be computed directly from p coa . Let 
£ coa be a column vector which contains the absolute values of the boundary 
derivatives z\ multiplied with weights Wi, positions Zi, and the complex scalar 
e. Then 

g = ^{Ccoa} T RPcoa- (19) 
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Figure 4: Left: The prolongation operator Pb c performs panelwise interpolation 
from a grid on a four-panel mesh to a grid on a six-panel mesh. Right: The 
sparsity pattern of Pb c - 

6 The recursion 

The compressed weighted inverse R is costly to compute from its defini- 
tion (|18p . As we saw in Section 01 the inversion of large matrices (I + K) on 
highly refined grids could also be unstable. Fortunately, the computation of 
R can be greatly sped up and stabilized via a recursion. This recursion is 
derived in rather clumsy notation and using a refined grid that differs from 
that of the present tutorial in |19l Section 7.2]. A better derivation can be 
found in [14, Section 5], but there the setting is more general so that text 
could be hard to follow. Here we focus on results. 

6.1 Basic prolongation matrices 

Let Pbc be a prolongation matrix, performing panelwise 15-degree polyno- 
mial interpolation in parameter from a 64-point grid on a four-panel mesh 
to a 96-point grid on a six-panel mesh as shown in Figure HI Let Pyt/bc be 
a weighted prolongation matrix in the style of (|13p . If T16 and W16 are the 
nodes and weights of 16-point Gauss-Legendre quadrature on the canonical 
interval [—1,1], then Pb c and P^/bc can be constructed as 

T32=[T16-l;T16+l]/2; 
W32=[W16;W16]/2; 
A=ones (16) ; 
AA=ones(32, 16) ; 
for k=2:16 

A(: ,k)=A(: ,k-l) .*T16; 

AA(: ,k)=AA(: ,k-l) .*T32; 
end 
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IP=AA/A; 

IPW=IP.*(W32*(1./W16) ') ; 

% 

Pbc=zeros(96,64) ; 

Pbc( 1:16, l:16)=eye(16) ; 

Pbc(17:48,17:32)=IP; 

Pbc (49 : 96 , 33 : 64) =f lipud(f liplr (Pbc ( 1 : 48 , 1 : 32) ) ) ; 

% 

PWbc=zeros(96,64) ; 

PWbc( 1:16, 1: 16)=eye(16) ; 

PWbc(17:48,17:32)=IPW; 

PWbc (49 : 96 , 33 : 64) =flipud(f liplr (PWbc (1 : 48 , 1 : 32) ) ) ; 

A minor improvement in accuracy can be obtained by constructing Pb c and 
Pwbc in extended precision and then rounding the result to double precision. 

6.2 Discretization on nested meshes 

Let T*, i = 1,2, ... , n sub , be a sequence of subsets of T* with r*_ x C T* 
and T* sub = r*. Let there also be a six-panel mesh and a corresponding 
96-point grid on each r*. The construction of the subsets and their meshes 
should be such that if z(s), s £ [—2,2], is a local parameterization of T*, 
then the breakpoints (locations of panel endpoints) of its mesh are at s £ 
{—2,-1,-0.5,0,0.5,1,2} and the breakpoints of the mesh on are at 
s = {-1, -0.5, -0.25,0, -0.25,0.5, 1}. We denote this type of nested six- 
panel meshes type b. The index i is the level. An example of a sequence of 
subsets and meshes on T* is shown in Figure for n su b = 3. Compare [T3J 
Figure 2] and [HI Fig 5.1]. 

Let Kjb denote the discretization of K on a type b mesh on T*. In the 
spirit of (|ll|12p we write 

K, b = K* b + K° b , (20) 




Figure 5: The boundary subsets Tg, Y\, and T\ along with their corresponding 
type b meshes for n sub = 3. 
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Figure 6: Same as Figure[3l but using (II 7 1) and the program demo2b.m (a loop 
of demo2 .m). There are only n p = 160 unknowns in the main linear system. 



where the superscript * indicates that only entries with both indices corre- 
sponding to points on the four inner panels are retained. 

6.3 The recursion proper 

Now, let R„ siib denote the full 64 x 64 diagonal block of R. The recursion 
for R„ siib is derived in Appendix C and it reads 

R, = P^ bc (F{Rr\} + 1° + AK? b ) _1 P bc , i = 1, . . . , n sub , (21) 
F{R 1 } = lS + AKt b , (22) 

where the operator F{-} expands its matrix argument by zero-padding (adding 
a frame of zeros of width 16 around it). Note that the initializer (j22[) makes 
the recursion (f2Tj) take the first step 

Ri = P^bc (lb + AKib)- 1 P bc . 

The program demo2.m sets up the linear system (|17p . runs the recur- 
sion (|21l22p . and solves the linear system using the same techniques as 
demol.m, see Section 21 In fact, the results produced by the two programs 
are very similar, at least up to n sub = 40. This supports the claim of Sec- 
tion [5] that discretization error in the solution is unaffected by compression. 

Figure demonstrates the power of RCIP: fewer unknowns and faster 
execution, better conditioning (the number of GMRES iterations does not 
grow) , and higher achievable accuracy. Compare Figure We emphasize 
that the number n sub of recursion steps (levels) used in (f2Tj) corresponds to 
the number of subdivisions n sub used to construct the fine mesh. 
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Convergence of q with mesh refinement 




GMRES iterations for full convergence 
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Figure 7: Same as Figure^ but the program demo3b.m is used. 
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7 Schur— Banachiewicz speedup of the recursion 

The recursion (|2ip can be sped up using the Schur-Banachiewicz inverse for- 
mula for partitioned matrices [24], which in this context can be written |20t 
Appendix B] 









I 



A- 1 
V 



u 

D 



n -1 



P* 
I 



P^AP* + P*f AU(D - VAU)- 1 VAP* -P^AU(D - VAU)- 1 
— (D — VAU) _1 VAP* (D-VAU) -1 

(23) 

where A plays the role of Rj_i, P* and P^ are submatrices of Pb c and 
PvKbc, and U, V, and D refer to blocks of AK° b . 

The program demo3.m is based on demo2.m, but has (|23p incorporated. 
Besides, the integral equation © is replaced with 

p(r)+2X f -£-G(r, r')p(r') do r ,+ ! p(r') da r , = 2A (e • v r ) , r G V , (24) 
Jr our Jr 

which has the same solution p(r) but is more stable for A close to one. For 
the discretization of (|24p to fit the form (|17p . the last term on the left hand 
side of dMD is added to the matrix AK° oa of (fTTD . 

The execution of demo3 . m is faster than that of demo2 . m. Figure [7] shows 
that a couple of extra digits are gained by using (|24p rather than ([3]) and 
that full machine accuracy is achieved for n su b > 60. 
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8 Various useful quantities 



Let us introduce a new discrete density p coa via 

Pcoa = Rpcoa- ( 2 5) 

Rewriting (fT7|) in terms of p coa gives 

(R- 1 + AK° oa ) p coa = Ag coa , (26) 

which resembles the original equation ([7]). We see that K° oa , which is dis- 
cretized using Gauss-Legendre quadrature, acts on p coa . Therefore one can 
interpret p coa as pointwise values of the original density p(r), multiplied with 
weight corrections suitable for integration against polynomials. We refer to 
p coa as a weight- corrected density. Compare [20 s Section 5.4]. 

Assume now that there is a square matrix S which maps p coa to discrete 
values p coa of the original density on the coarse grid 

Pcoa = SPcoa • ( 2 ?) 

The matrix S allows us to rewrite (|lTj) as a system for the original density 

(S- 1 + AK^RS- 1 ) p coa = Ag coa . (28) 

We can interpret the composition RS^ 1 as a matrix of multiplicative weight 
corrections that compensate for the singular behavior of p(r) on T* when 
Gauss-Legendre quadrature is used. 
Let Y denote the rectangular matrix 

Y = (I fin + AK^ n )- 1 P, (29) 

and let Q be a restriction operator which performs panelwise 15-degree 
polynomial interpolation in parameter from a grid on the fine mesh to a 
grid on a the coarse mesh. We see from (|15p that Y is the mapping from 
p coa to p fin . Therefore the columns of Y can be interpreted as discrete basis 
functions for p(r). It holds by definition that 

QP = Icoa , (30) 

QY = S. (31) 

The quantities and interpretations of this section come in handy in var- 
ious situations, for example in 3D extensions of the RCIP method |21j . An 
efficient scheme for constructing S will be presented in Section [TUJ 
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9 Reconstruction of p fin from p 



coa 



The action of Y on p coa , which gives p fin , can be obtained by, in a sense, 
running the recursion (|2ip backwards. The process is described in detail 
in [131 Section 7]. Here we focus on results. 
The backward recursion on V* reads 

Pcoa.i = !b - AK° b (F{R^1 1 1 } + 1£ + AK° b ) 1 PbcP CO a,j ! i = n sub ,...,l. 

(32) 

Here p coa j is a column vector with 64 elements. In particular, p coajnsub is 
the restriction of p coa to T*, while p coai are taken as elements {17 : 80} of 
Pcoai+i f° r * < n sub- The elements {1 : 16} and {81 : 96} of p coai are the 
reconstructed values of p fin on the outermost panels of a type b mesh on V*. 
Outside of T*, p fin coincides with p coa . 

When the recursion is completed, the reconstructed values of p fin on the 
four innermost panels are obtained from 

R 0Pcoa,0 • ( 33 ) 

Should one wish to interrupt the recursion (|32p prematurely, at step i = j 
say, then 

Pcoa, (i-1) (34) 

gives values of a weight-corrected density on the four innermost panels of 
a type b mesh on T^. That is, we have a part-way reconstructed weight- 
corrected density p part on a mesh that is n SUD — j + 1 times refined. This 
observation is useful in the context of evaluating layer potentials close to 
their sources. 

If the memory permits, one can store the matrices K° b and Rj in the 
forward recursion (|2ip and reuse them in the backward recursion ()32|) . Oth- 
erwise they may be computed afresh. 

The program demo4 .m builds on the program demo3 .m, using (|17j) for (|24p . 
After the main linear system is solved for p coa , a postprocessor reconstructs 
p fin via (|32p . Then a comparison is made with a solution p fin obtained by 
solving the un-compressed system flSJ). Figure [8] shows that for n SUD < 10 
the results are virtually identical. This verifies the correctness of (|32p . For 
n sub > 10 the result start to deviate. That illustrates the instabilities asso- 
ciated with solving ([8|) on a highly refined mesh. Compare Figure O 

The program demo5.m investigates the effects of premature interruption 
of ([32]) . The number of recursion steps is set to n su b = 100 and the recursion 
is interrupted at different levels. The density p fin is reconstructed on outer 
panels up to the level of interruption. Then a weight-corrected density is 
produced at the innermost four panels according to (f34"|) . Finally q of ([6]) 
is computed from this part-way reconstructed solution. The right image of 
Figure [8] shows that the quality of q is unaffected by the level of interruption. 
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Figure 8: Output from demo4.m and demo5.m. Left: A comparison of p fin 
from the unstable equation © and p fin reconstructed from p coa of (|17l) via Q32I) . 
Right: Relative accuracy in q of © from part-way reconstructed solutions p part . 



10 The construction of S 

This section discusses the construction of S and other auxiliary matrices. 
Note that in many applications, these matrices are not needed. 

The entries of the matrices P, Pry, Q, R, S, and Y can only differ from 
those of the identity matrix when both indices correspond to discretization 
points on T*. For example, the entries of R only differ from the identity 
matrix for the 64 x 64 block denoted Rn sub in (|2ip . In accordance with 
this notation we introduce Pn sub , Pifn s „ b , Qn sub , S„ sub and Y nsub for the 
restriction of P, P^, Q, S and Y to T*. In the codes of this section we 
often use this restricted type of matrices, leaving the identity part out. 

We observe that S„ sub is a square 64 x 64 matrix; P„ sub , P\y nBub and 
Y„ sub are rectangular 16(4 + 2n su b) x 64 matrices; and Qn sub is a rectangular 
64 x 16(4 + 2n su b) matrix. Furthermore, Qn sub is very sparse for large n su b- 
All columns of Qn sub with column indices corresponding to points on panels 
that result from more than eight subdivisions are identically zero. 

The program demo6.m sets up P„ sub , Pwn Bub and Qn sub , shows their 
sparsity patterns, and verifies the identities (|14[) and f|30j) . The implemen- 
tations for Pn sub and ~Pw naub rely on repeated interpolation from coarser 
to finer intermediate grids. The implementation of Qn sub relies on keeping 
track of the relation between points on the original coarse and fine grids. 
Output from demo6.m is depicted in the left image of Figure [9j Note that 
the matrices Pn sub and Pwn suh are never needed in applications. 

We are now ready to construct S. Section [9] presented a scheme for 
evaluating the action of Y„ sub on discrete functions on the coarse grid on 
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Figure 9: Left: The identities (I14I) and fj30l) hold to high accuracy in our 
implementation, irrespective of the degree of mesh refinement. Right: The 
solution p to (j24p on iJT]) with parameters as specified in Section 0] The solution 
with RCIP (TP7]) and ([27]) . shown as blue stars, agrees with the solution from (J8]), 
shown as a red solid line. The solution diverges in the corner. 

r*. The matrix Y„ sub , itself, can be constructed by applying this scheme 
to a 64 x 64 identity matrix. The matrix Qn sub was set up in demo6.m. 
Composing these two matrices gives S„, sub , see (J3TJ) . This is done in the 
program demo7.m, where the identity part is added as to get the entire 
matrix S. In previous work on RCIP we have found use for S in complex 
situations where ([28]) is preferable over ([T7]) , see [211 Section 9] . If one merely 
needs p coa from p coa in a post-processor, setting up S and using ([2"T]) is not 
worthwhile. It is cheaper to let Y act on p coa and then let Q act on the 
resulting vector. Anyhow, demo7.m builds on demo4.m and gives as output 
p coa computed via ([27]) . see the right image of Figure [9] For comparison, 
p fin , computed from the un-compressed system (JSj , is also shown. 

11 Initiating R using fixed-point iteration 

It often happens that V* is wedge-like. A corner of a polygon, for example, 
has wedge-like T* at all levels. If T* is merely piecewise smooth, then the 
T* are wedge- like to double precision accuracy for i <C n su t>- 

Wedge-like sequences of T* open up for simplifications and speedup in 
the recursion ()21l22p . Particularly so if the kernel of the integral operator 
K of ([5]) is scale invariant on wedges. Then the matrix K° b becomes in- 
dependent of i. It can be denoted K£ and needs only to be constructed 
once. Furthermore, the recursion ([21 (22j) assumes the form of a fixed-point 
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Convergence of q with mesh refinement 



20 40 60 80 

Number of subdivisions n t 

sub 



100 



GMRES iterations for full convergence 



10 



10 



20 40 60 80 100 

Number of subdivisions n t 

sub 



Figure 10: Same as Figure [Jj but the program demo8b.m is used. 



iteration 



R l = P^ bc (F{Rr\} + I°+AKy 



be 



F{R^} = l£ + AK£. 



(35) 
(36) 



The iteration ([35]) can be run until Rj converges. Let R* be the converged 
result. One need not worry about predicting the number of levels needed. 
Choosing the number n su b of levels needed, in order to meet a beforehand 
given tolerance in R* , is otherwise a big problem in connection with (|2 1I22|) 
and non- wedge- like T*. This number has no general upper bound. 

Assume now that the kernel of K is scale invariant on wedges. If all T* are 
wedge-like, then (|35I36|) replaces (|21I22|) entirely. If T* is merely piecewise 
smooth, then ()35l36p can be run on a wedge with the same opening angle 
as r*, to produce an initializer to (|21|) . That initializer could often be more 
appropriate than (|22p , which is plagued with a very large discretization error 
whenever (|10p is used. This fixed-point recursion initializer is implemented 
in the program demo8b.m, which is a simple upgrading of demo3b.m, and 
produces Figure [TO] A comparison of Figure [10] with Figure [7J shows that 
the number n su b of levels needed for full convergence is halved compared to 
when using the initializer (|22p . 

There are, generally speaking, several advantages with using the fixed- 
point recursion initializer, rather than (|22p . in (|2ip on a non- wedge-like T*: 
First, the number of different matrices Ri and K° b needed in (|21|) and in (|32|) 
is reduced as the recursions are shortened. This means savings in storage. 
Second, the number n su b of levels needed for full convergence in (|2ip seems 
to always be bounded. The hard work is done in (|35p . Third, Newton's 
method can be used to accelerate ([35]) . That is the topic of Section [T2l 
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Fixed-point iteration versus Newton's method 




Figure 11: Output from the program demo9.m. The fixed-point iteration 
f)35l36|) is compared to Newton's method for ([57]) . 



12 Newton acceleration 

When solving integral equations stemming from particularly challenging el- 
liptic boundary value problems with solutions p(r) that are barely absolutely 
integrable, the fixed-point iteration (j35!36[) on wedge-like T* may need a very 
large number of steps to reach full convergence. See |16[ Section 6.3] for an 
example where 2 • 10 5 steps are needed. 

Fortunately, (|35p can be cast as a non-linear matrix equation 

G(R*) = P^ bc A(R,)P bc -R* = 0, (37) 

where R*, as in Section PTT1 is the fixed-point solution and 

A(R») = (FIR; 1 } + I° h + AK b ) 1 . (38) 

The non-linear equation (|37p . in turn, can be solved for R* with a variant 
of Newton's method. Let X be a matrix-valued perturbation of R* and 
expand G(R* + X) = to first order in X. This gives a Sylvester- type 
matrix equation 

X - P^ bc A(R,)F{R; 1 XR; 1 }A(R,)P bc = G(R*) (39) 

for the Newton update X. One can use the Matlab built-in function dlyap 
for ([39]) . but GMRES seems to be more efficient and we use that method. 
Compare [16j Section 6.2]. 
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Convergence of with mesh refinement 
10° i r-^ , , r— 



* <7 
o U(i) 




20 40 60 80 100 

Number of subdivisions n „ 

sub 



Figure 12: Similar as Figure [7j left image, but with the program demo3c.m. 
The potential U(r) at r = (0.4,0.1) is evaluated via (1401) and npan=14 is used. 

Figure [TTJ shows a comparison between the fixed-point iteration (|35|36p 
and Newton's method for computing the fixed-point solution R* to (f37"|) on 
a wedge-like T*. The program demo9.m is used and it incorporates Schur- 
Banachiewicz speedup in the style of Section [JJ The wedge opening angle is 
9 = it/2, The integral operator K is the same as in ([5]), and A = 0.999. The 
relative difference between the two converged solutions R* is 5.6 • 10~ 16 . 
Figure [11] clearly shows that ()35|36j) converges linearly (in 68 iterations) , 
while Newton's method has quadratic convergence. Only four iterations 
are needed. The computational cost per iteration is, of course, higher for 
Newton's method than for the fixed-point iteration, but it is the same at 
each step. Recall that the underlying size of the matrix (Ig n + AKg n ), that 
is inverted according to (|18p . grows linearly with the number of steps needed 
in the fixed-point iteration. This example therefore demonstrates that one 
can invert and compress a linear system of the type (|18|) in sub-linear time. 

13 On the accuracy of "the solution" 

The integral equation ([3]) comes from a boundary value problem for Laplace's 
equation where the potential field U{r) at a point r in the plane is related 
to the layer density p(r) via 

U(r) = (e • r) - J G(r, r')p(r') &a r , , (40) 

see [IS Section 2.1]. 

Figure [12] shows how the field solution U (r) of (|40p converges with mesh 
refinement at a point r = (0.4,0.1) inside the contour T of (fT]). We see 
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that the accuracy in U(r), typically, is one digit better than the accuracy 
of the dipole moment Q of dH]). One can therefore say that measuring the 
field error at a point some distance away from the corner is more forgiving 
than measuring the dipole moment error. It is possible to construct examples 
where the difference in accuracy between field solutions and moments of layer 
densities are more pronounced and this raises the question of how accuracy 
best should be measured in the context of solving integral equations. We do 
not have a definite answer, but think that higher moments of layer densities 
give more fair estimates of overall accuracy than do field solutions at points 
some distance away from the boundary. See, further, Appendix E. 



14 Composed integral operators 

Assume that we have a modification of (JSJ which reads 
(I + MK) Pl (z) =g(z), zGT. 



(41) 



Here K and g are as in ([5]), M is a new, bounded, integral operator, and 
pi is an unknown layer density to be solved for. This section shows how to 
apply RCIP to (|4ip using a simplified version of the scheme in |14j . 

Let us, temporarily, expand (|41|) into a system of equations by introduc- 
ing a new layer density P2(z) = Kp±(z). Then 



Pl {z) + Mp 2 (z) 
-K Pl {z)+ p 2 (z) 

and after discretization on the fine mesh 

Ofin M fin " 
-Kfin Ofi n 



9(z) 

0, 





Ifin 


Ofin " 


+ 


( 


. fin 


Ifin 



) 


Plfin 




gfin 




P2fin 








(42) 
(43) 



(44) 



Standard RCIP gives 



Icoa coa 
Onna. Ir.oa 



+ 



u coa 

K° 



M° 

OroR 



Ri 
R 2 



R 3 
R4 



Plcoa 
P2coa 



gcoa 





(45) 



where the compressed inverse R is partitioned into four equisized blocks. 
Now we replace p lcoa and p 2coa with a single unknown p coa via 



Plcoa 
P2coa 



Pcoa xl "l 
KcoaR-lPc 



R1 RsKcoaR-l 

rcoa ' 



(46) 
(47) 



The change of variables (|46l47p is chosen so that the second block-row of (|45p 
is automatically satisfied. The first block-row of (|45[) becomes 



(48) 



[icoa + M° oa (R 4 - R 2 R X R3) K° oa Ri + M° oa R 2 

-R7 R 3 K° Ri] p c 
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Convergence of q with mesh refinement Convergence of q with mesh refinement 




20 40 60 80 100 20 40 60 80 100 

Number of subdivisions n t Number of subdivisions n t 

sub sub 



Figure 13: Convergence for q of (0) with p = pi from (j4"Tp . The curve T is as 
in ([!]) and theta=pi/2, npan=ll, and evec=l. The reference values is taken 
as q = 1.95243329423584. Left: Results from the inner product preserving 
scheme of Appendix D produced with demolO.m. Right: Results with RCIP 
according to ()48l49p produced with demolOb.m 

When (T48|) has been solved for p coa , the weight-corrected, version of the 
original density p\ can be recovered as 

Plcoa = R-lPcoa ■ ( 4 9) 

Figure [131 shows results for (|4ip with M being the double layer potential 



Mp{z) = - 




The left image shows the convergence of q of (0) with n su b using the in- 
ner product preserving discretization scheme of Appendix D for f)41[) as 
implemented in demolO.m. The right image shows q produced with RCIP 
according to ([48I49P as implemented in demolOb.m. The reference value 
for q is computed with the program demolOc.m, which uses inner product 
preserving discretization together with compensated summation |25[ [26] in 
order to enhance the achievable accuracy. One can see that, in addition to 
being faster, RCIP gives and extra digit of accuracy. Actually, it seems as 
if the scheme in demolO.m converges to a q that is slightly wrong. 

In conclusion, in this example and in terms of stability, the RCIP method 
is better than standard inner product preserving discretization and on par 
with inner product preserving discretization enhanced with compensated 
summation. In terms of computational economy and speed, RCIP greatly 
outperforms the two other schemes. 
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15 Nystrom discretization of singular kernels 



The Nystrom scheme of Section 0] discretizes ([5]) using composite 16-point 
Gauss-Legendre quadrature. This works well as long as the kernel of the 
integral operator K and the layer density are smooth on smooth V. When 
the kernel is not smooth on smooth T, the quadrature fails and something 
better is needed. See for a comparison of the performance of various 
modified high-order accurate Nystrom discretizations for weakly singular 
kernels and [22] for a recent high-order general approach to the evaluation 
of layer potentials. 

We are not sure what modified discretization is optimal in every situ- 
ation. When logarithmic- and Cauchy-singular operators need to be dis- 
cretized in Sections [1614181 below, we use two modifications to composite 
Gauss-Legendre quadrature called local panelwise evaluation and local reg- 
ularization. See [13j Section 2] for a description of these techniques. 



16 The exterior Dirichlet Helmholtz problem 

Let D be the domain enclosed by the curve T and let E be the exterior to 
the closure of D. The exterior Dirichlet problem for the Helmholtz equation 

A£7(r) +oo 2 U{r) = 0, r G E , (51) 
lim U{r) = g(r°) , r° G T , (52) 

7-3-E— 

lim (J^ - iu?j U(r) = , (53) 



r — >oo 



can be modeled using a combined integral representation [HI Chapter 3] 

C/ ( r ) = I -^~®Ur, r')p{r') da r , - i / $ w (r, r')p{r') &a r , , r G R 2 \ T , 

(54) 

where $ w (r, r') is the fundamental solution to the Helmholtz equation in 
two dimensions 

* w (r,r') = ~H( l \u>\r - r'\) . (55) 

Hereff,^-) is a Hankel function of the first kind. Insertion of (|54p into (|52p 
gives the combined field integral equation 

(I + K u - iS u ) P {r) = 2g(r) , r G T , (56) 

where 

K u p(r) = 2 f J-$Ur, r')p{r') da r , , (57) 



Sup(r)=2j ^(r,r')p(r')da r 



(58) 
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Combined field integral equation GMRES iterations for full convergence 




Figure 14: The exterior Dirichlet problem for Helmholtz equation with RCIP 
applied to (j56l) . The program demoll.m is used with T as in JT]) and 8 = tt/2. 
The boundary condition g(r) of ([52]) is generated by a point source at (0.3, 0.1). 
Left: the absolute error in U(r) at r = (—0.1,0.2). Right: the number of 
GMRES iterations needed to meet an estimated relative residual of e mac h- 

Figure H3] shows the performance of RCIP applied to ([56]) for 1000 dif- 
ferent values of oj € [1, 10 3 ]. The program demoll .m is used. The boundary 
r is as in (pQ) with 9 = tt/2 and the boundary conditions are chosen as 
g(r) = Hq(v — r') with r' = (0.3, 0.1) inside T. The error in U(r) of (|54|) is 
evaluated at r = (—0.1,0.2) outside Y. Since the magnitude of U{r) varies 
with w, peaking at about unity, the absolute error is shown rather than 
the relative error. The number of panels on the coarse mesh is chosen as 
npan=0 . 55*omega+18 rounded to the nearest integer. 

17 The exterior Neumann Helmholtz problem 

The exterior Neumann problem for the Helmholtz equation 



AU(r) + oj 2 U(r) = 0, r G E , (59) 




can be modeled as an integral equation in several ways. We shall consider 
two options: an "analogy with the standard approach for Laplace's equa- 
tion", which is not necessarily uniquely solvable for all u, and a "regularized 
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Figure 15: The exterior Neumann problem for Helmholtz equation with RCIP 
applied to (j65l) . The program demol2.m is used with T as in {TJ and 9 = tt/2. 
The boundary condition g(r) of (|60]1 is generated by a point source at (0.3, 0.1). 
Left: the absolute error in U(r) at r = (—0.1,0.2). Right: the number of 
GMRES iterations needed to meet an estimated relative residual of e mac h- 

combined field integral equation" which is always uniquely solvable. See, 
further, [HE]. 

17.1 An analogy with the standard Laplace approach 

Let be the adjoint to the double-layer integral operator K u of (|5T|) 

K' u p(r) = 2 / 5-* w (r, r')p{r') doy, • (62) 

(63) 

Insertion of the integral representation 

U(r) = j $ u (r,r')p(r')da r , , rel 2 \l (64) 

into (|60p gives the integral equation 

(/ - K) p{r) = -2g(r) , r G T . (65) 

Figure [TS] shows the performance of RCIP applied to ([6"5]) for 1000 dif- 
ferent values of uj € [1, 10 3 ]. The program demol2 .m is used and the setup is 
the same as that for the exterior Dirichlet problem in Section [TjJJ A compar- 
ison with Figure [TJ] shows that the results are similar. The main difference 
being that the relative error in the solution U(r) is larger for the Neumann 
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problem than for the Dirichlet problem when uj happens to be close to values 
for which the integral operators in (|65p have a nontrivial nullspace. Recall 
that ([56]) is always uniquely solvable while ([65]) is not. 



17.2 A regularized combined field integral equation 

The literature on regularized combined field integral equations for the exte- 
rior Neumann problem is rich and several formulations have been suggested. 
We shall use the representation |7J 



U(r 



[ $ u (r, r')p(r') d<v+i f ^-^(r, O (S iuJ p) (r') da r , , r G R 2 \F , 
Jr Jr o^r' 

(66) 

which after insertion into (|60|) gives the integral equation 



{I-K- iTA) p(r) = -2g{r) , r G V , (67) 

where 

T u p(r) = 2— / — - $ w (r, r>(r') d<v . (68) 
Cz^y Jp CIV 

The hypersingular operator T w of (J6SJ) can be expressed as a sum of a 
simple operator and an operator that requires differentiation with respect 
to arc length only 



T u p(r) = 2u? [ ^(r,r'){v r -v r ,) P {r')&a r , + 2^- [ * w (r, r')^^- da r , . 
Jr day J r doy> 

(69) 

This makes it possible to write (|67p in the form 

(I + A + B X B 2 + dC 2 ) p{r) = -2g(r) , r G T , (70) 

where ^4 = — if', Z?2 = Skj, and the action of the operators B\, C%, and C 2 
is given by 



Sip(r) = -i2u 2 $Ur,r'){v r • ^r')p(^) d<V , (71) 

C 1/9 (r) = -12-^- / * w (r, r')p(r') da r , , (72) 
day J r 

C 2 p(r) = 2^- ( $ iw (r, r')p(r') day . (73) 

All integral operators in ()70p are such that their discretizations admit the 
low-rank decomposition (j!6p . We use the temporary expansion technique of 
Section [2] for (|70p . with two new layer densities that are later eliminated, 
to arrive at a single compressed equation analogous to (|3Hj) . That equation 
involves nine equisized blocks of the compressed inverse R. 
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Figure 16: The same exterior Neumann problem for Helmholtz equation as in 
Figure [T5l but RCIP is now applied to (j70l) . The program demol3b.m is used. 



Solving the problem in the example of Section 117.11 again, we take the 
number of panels on the coarse mesh as npan=0 . 6*omega+48 rounded to the 
nearest integer. Figure [TBI shows results from the program demol3b.m. The 
resonances, visible in Figure [T5l are now gone. It is interesting to observe 
in Figure [16] that, despite the presence of several singular operators and 
compositions in (jTUj) . the results produced with RCIP are essentially fully 
accurate and the number of GMRES iterations needed for convergence grows 
very slowly with cj. 

The program demol3c.m differs from demol3b.m in that it uses local 
regularization for the Cauchy-singular operators of (|72p and (j73|) rather than 
local panelwise evaluation. The results produced by the two programs are 
virtually identical, so we do not show yet another figure. 

18 Field evaluations 

Strictly speaking, a boundary value problem is not properly solved until 
its solution can be accurately evaluated in the entire computational do- 
main. The program demollb.m is a continuation of demoll.m which, after 
solving (f56|) for p coa with RCIP and forming p coa via ([25]) . computes the 
solution U(r) via (|54p using three slightly different discretizations: 

(i) When r is away from T, 16-point Gauss-Legendre quadrature is used 
in ([54"|) on all quadrature panels. 

(ii) When r is close to T, but not close to a panel neighboring a corner, 
16-point Gauss-Legendre quadrature is used in ([54")) on panels away 
from r and local panelwise evaluation is used for panels close to r. 
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Dirichlet: Log of estimated error in U{r) 




x 

Figure 17: The error in the solution U(r) to the exterior Dirichlet Helmholtz 
problem of Section [TJ] evaluated at 62392 points on a cartesian grid using 
demo 1 lb. m with ui = 10. The source at r' = (0.3,0.1) is shown as a blue star. 



Neumann: Log of estimated error in U(i) 




x 

Figure 18: Same as Figure[T7J but the exterior Neumann Helmholtz problem is 
solved using demol3d.m. The accuracy is even higher than in Figure [171 
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(iii) When r is close to a panel neighboring a corner, the density p coa is 
first used to reconstruct p part according to Section [9j Then 16-point 
Gauss-Legendre quadrature is used in (j5lj) on panels away from r and 
local panelwise evaluation is used for panels close to r. 

The first two discretizations only use the coarse grid on T. The third dis- 
cretization needs a grid on a partially refined mesh on T. 

The program demo 13d. m is a continuation of demo 13b. m which, after 
solving ([67]) with RCIP as described in Section fl 7. 2 \ computes the solution 
U (r) via (|66p using the three discretizations of the previous paragraph. 

Figure [T71 and Figure [TBI show that RCIP in conjunction with the modi- 
fied quadrature rules of [131 Section 2] is capable of producing very accurate 
solutions to exterior Helmholtz problems in, essentially, the entire compu- 
tational domain. 
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*** Appendicies *** 



Appendix A. Proof that P^P = I CO a 

Let f coa and gcoa be two column vectors, corresponding to the discretization 
of two panelwise polynomials with panelwise degree 15 on the coarse mesh 
of r. Then 

f C oaW coagcoa = (Pf coa ) T W fin (Pg coa ) = f£ a P T W fin Pg coa , (A.l) 

because composite 16-point Gauss-Legendre quadrature has panelwise poly- 
nomial degree 31. The diagonal matrix W coa has size 16n pan x 16n pan . 

Since there are 16n pan linearly independent choices of f coa and of g coa it 
follows from (jA.l[) that 

W coa = P T W fin P , (A.2) 
which, using ()13p . can be rewritten 

Icoa = W co 1 a P T W fin P = P^P . (A.3) 
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Appendix B. Derivation of the compressed equation 

The compression of (j5J), leading up to (fTT|) . was originally described in [I5J 
Section 6.4]. Here we give a summary. 

The starting point is © which, using the operator split analogous to (|llll2p 

K = K* + K° (B.l) 

and the variable substitution 

p(z) = (I + XK*)- 1 ~p(z) , (B.2) 
gives the right preconditioned equation 

p(z) + XK°(I + XK*)- 1 p{z) = Xg(z), zeT. (B.3) 

Now, let us take a close look at pOj) . We observe that K°(I + XK+y 1 
is an operator whose action on any function gives a function that is smooth 
on the innermost two panels of the coarse grid on T*. This is so since K° 
is constructed so that its action on any function gives a function that is 
smooth on the innermost two panels of the coarse grid on T*. Furthermore, 
the right hand side Xg(z) of (|B.3|) is assumed to be panelwise smooth. Using 
an argument of contradiction we see that p(z) has to be a panelwise smooth 
function on the innermost two panels of the coarse grid on r*. 

Having concluded that p(z) is panelwise smooth close to the corner we 
can write 

Pfin = P Pcoa- ( B - 4 ) 

We also have 

gfin = Pgcoa , (B.5) 

the discrete version of ()B.2p on the fine grid 

Pfin = (Ifm + AK4J- 1 p fin , (B.6) 

and the relations (|12p and (|16|) which we now repeat: 

K fin = K£ n + K£ n , (B.7) 

K fin = PK coa P W ■ ( B - 8 ) 

Substitution of (jB.4IB.5IB.6IB.7IB.8h into ©, which we now repeat: 

(I fln + AK fln ) p fin = Ag fin , (B.9) 

gives 

Pp coa + APK° oa P^ (I fin + AK^)- 1 Pp coa = Pg coa . (B.10) 

Applying Pj^ (or Q) to the left in rfB~TU|) and using the identities (|14p 
(or pOj) ) gives the compressed equation (fT?]) . 
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Appendix C. Derivation of the recursion 

The recursion (|2ip for the rapid construction of the diagonal blocks of the 
compressed weighted inverse R was originally derived in \19\ Section 7] using 
different notation and different meshes than in the present tutorial. The 
recursion was derived a second time in |20[ Section 7] using new meshes. 
Better notation was introduced in |13[ Section 6]. A third derivation, in a 
general setting, takes place in [TJ1 Section 5] and it uses the same notation 
and meshes as in the present tutorial. 

A problem when explaining the derivation of (|21|) is that one needs to 
introduce intermediate meshes and matrices whose appearance may cause 
enervation at a first glance. Particularly so since these meshes and matrices 
are not needed in the final expression ()2ip . We emphasize that the under- 
lying matrix property that permits the recursion is the low rank of certain 
off-diagonal blocks in discretizations of K° of (|B.lj) on nested meshes. 




Figure 19: Meshes of type a, type b, and type c on the boundary subset T* for 
i = 3 and n su b = 3. The type a mesh has 4 + 2i panels. The type b mesh has 
six panels. The type c mesh has four panels. The type a mesh is the restriction 
of the fine mesh to r*. For i = n su b, the type c mesh is the restriction of the 
coarse mesh to T*. The type a mesh and the type b mesh coincide for i = 1. 

The recursion (f2Tj) only uses uses one type of mesh explicitly - the type 
b mesh of Figure [5j On each V* there is a type b mesh and a corresponding 
discretization of K° denoted K° b . Here we need two new types of meshes 
denoted type a and type c, along with corresponding discrete operators. For 
example, Kj a is the discretization of K on a type a mesh on T*. The three 
types of meshes are depicted in Figure [TU1 Actually, a straight type c mesh 
was already introduced in Figure [H 

Now we define Rj as 

Rj = P^j ac (Ija + AKj a ) 1 Pj ac , (CI) 

where Pwi&c and P« a c are prolongation operators (in parameter) from a grid 
on a type c mesh on T* to a grid on a type a mesh on T*. Note that R; for 
i = n sub) according to the definition (jC.ip . is identical to the full diagonal 
64 x 64 block of R of (fTBj) . Note also that Ri comes cheaply. The rest of 
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this appendix is about finding an expression for Rj in terms of Rj-i that is 
cheap to compute. 

Let us split Kj a into two parts 

K, a = K* a + K° a , (C.2) 

where K* a = F{K(j_i) a } and K° a is such that 

K° a = Pj a bK° b P^ iab (C3) 

holds to about machine precision, compare (|16p . The prolongation operators 
Pj a b and Pwiab act from a grid on a type b mesh to a grid on a type a mesh. 
It holds that 

Piac — PiabPbc j 
PvKiac = PwiabPvKbc • (C5) 

Summing up, we can rewrite (jC.lj) as 

Pi = ^Wbc^Wiab (Xa + ^{^K(i_i) a } + APj a bK° b P^ iab ) Pj a bPbc • 

(C.6) 

The subsequent steps in the derivation of (j2Tj) are to expand the term 
within parenthesis in (|C,6p in a Neumann series, multiply the terms in this 
series with P^j ab from the left and with P« a b from the right, and bring the 
series back in closed form. The result is 



R-i — PpJ/bc 



(p^ ab (l, a + F{AK (j _ 1)a }) X P iab ) 1 + AK° b 



Pbc : 

(C.7) 



which, in fact, is (|2ip in disguise. To see this, recall from (jC.lj) that 



R(i_l) = P^/(j_i) ac (I(i-l)a + ^ K (i-l)a) 1 P(i-i)ac • (C8) 



Then 



F { R (i-l)} = F { P W(j-l)ac ( X (i-l)a + AK (i _ 1)a ) 1 P(j_i) ac } 

= P^ iab (l ia + F{AK( i _ 1 ) a }) -1 P iab - II , (C.9) 

where the second equality uses P^j ab P«ab = Ib> see Appendix A. Substitu- 
tion of (|C.9p in ()C.7p gives the recursion in the familiar form 

R, = P^ bc (F{Rr\} + 1° + AK^)" 1 P bc . (CIO) 
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Appendix D. An inner product preserving scheme 



In [2], Bremer describes a scheme that stabilizes the solution to the dis- 
cretized system ([8]) on the fine mesh. The scheme can be interpreted as an 
inner product preserving discretization. In practice it corresponds to making 
a similarity transformation of the system matrix. While inner product pre- 
serving Nystrom discretization elegantly solves problems related to stability 
(the condition number of the system matrix is improved) it does not reduce 
the number of discretization points (unknowns) needed to achieve a given 
precision in the solution. Neither does it affect the spectrum of the system 
matrix (similarity transformations preserve eigenvalues) and hence it does 
not in any substantial way improve the convergence rate of the GMRES 
iterative method [301 Lecture 35]. 

For completeness, we have implemented inner product preserving Nystrom 
discretization in the program demo Id. m. The program is a continuation of 
demo lb .m where we also have replaced (J3j) with the more stable integral equa- 
tion (|24p . This should facilitate comparison with the program demo3b .m and 
the results shown in Figure [7J 

Convergence of q with mesh refinement GMRES iterations for full convergence 
10° | ■ ■ ■ 1 10 2 i 1 ■ , 




20 40 60 80 100 20 40 60 80 100 

Number of subdivisions n „ Number of subdivisions n . 

sub sub 



Figure 20: Same as Figure [7J but the program demold.m is used. 

Figure l20l shows results produced by demold.m. Beyond n su b = 60 one 
now achieves essentially full machine precision in q of Q. Despite this 
success, inner product preserving Nystrom discretization can perhaps not 
quite compete with the RCIP method in this example. The differences in 
performance relate to issues of memory and speed. The RCIP method uses 
a much smaller linear system (16n pan unknowns) than does inner product 
preserving Nystrom discretization (16(n pan + 2n su b) unknowns). Besides, 
the RCIP method converges in only eight GMRES iterations, irrespective of 
n sub- See Figure [7J 
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Appendix E. The nature of layer densities in corners 



It is considered difficult to solve integral equations on piecewise smooth 
boundaries. One reason being that solutions (layer densities) often have 
complicated asymptotics close to corner vertices. Also stable numerical 
schemes, such as that of Appendix D, are burdened with the task of res- 
olution. It takes a lot of mesh refinement to resolve non-polynomiahlike 
layer densities with polynomial basis functions. 

Sometimes layer densities diverge, sometimes they stay bounded, but 
they are seldom polynomial-like close to corner vertices. While we refrain 
from giving a characterization of what properties of Fredholm second kind 
integral equations give rise to unbounded solutions we observe the following: 

• The solution p{r) to ([3]) diverges in corners whenever !ft{A} > and A 
is such that p(r) exists. Figure [9] illustrates this for A = 0.999. 

• The size of the zone where the asymptotic shape of a layer density 
is visible depends on the smoothness of the solution to the PDE that 
underlies the integral equation. See [131 Figure 3]. 

• The divergence of layer densities close to corner vertices can be very 
strong. In classical material science applications it is not unusual with 
layer densities that barely lie in 1? |19| Section 2]. In metamaterial 
applications layer densities may barely lie in L 1 [161 Section 3]. Fur- 
thermore, it is likely that L p spaces are not the most natural function 
spaces for the characterization of layer densities |2H Section 5] . 

Variable separation is often used to predict the asymptotic behavior of 
layer densities close to corner vertices. See, for example, [191 Section 2] 
and [TH Section 2]. But perhaps asymptotic studies of layer densities are 
not that important? Assume that there are two different Fredholm second 
kind integral equations for the same underlying PDE: one where the layer 
density is bounded and one where it is unbounded. If the RCIP method 
is to be used, it really does not matter which equation is chosen. The 
recursion (|21I22|) might need fewer steps for the equation with a bounded 
solution. The achievable accuracy for functionals of the layer density, on 
the other hand, is often slightly higher if computed from the unbounded 
solution. The reason being, loosely speaking, that more information can be 
contained in a rapidly varying function than in a slowly varying function. 
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